Main
Anderson Banihirwe
I contribute to and maintain several libraries in the open source scientific Python stack, with a focus on improving the scalability of Python tools in order to (1) handle large-scale datasets on high-performance computing and cloud computing platforms and (2) move the open science paradigm forward.
Education
B.S., Computer Systems Engineering
University of Arkansas at Little Rock
Little Rock, AR
2014 - 2018
Professional Experience
Software Engineer
National Center for Atmospheric Research
Boulder, CO
2018-10 - current
- Contributed to and helped maintain the core software stack powering the Pangeo project, including dask and intake.
- Developed and maintained Xarray, an open source library for working with multidimensional, labeled datasets and arrays in Python.
- Created and maintained intake-ESM, a Python data cataloguing package for exploring and ingesting Earth system model datasets.
- Developed and delivered weekly, live (virtual and in-person), self-paced technical tutorials to NCAR scientists and their collaborators.
Software Developer Intern
Quansight
Austin, TX
2018-05 - 2018-09
- Developed xndframes, a Pandas ExtensionDtype/Array backed by xnd, a container type that maps most Python values relevant for scientific computing directly to typed memory.
- Worked on integrating cuDF, a GPU dataframe library, with the Apache Arrow library.
Data Science Intern
First Orion
Little Rock, AR
2017-11 - 2018-04
- Built scoring and predictive models with Scikit-learn, Dask, and Apache Spark using First Orion’s proprietary telecommunication data.
Research Intern
National Center for Atmospheric Research
Boulder, CO
2017-05 - 2017-08
- Developed spark-xarray, a Python package that integrates PySpark and xarray for climate data analysis.
Selected Publications, Posters, and Talks
Building Tools for the Scientific Python Community
12th Symposium on Advances in Modeling and Analysis Using Python at 2022 AMS Annual Meeting
Online
2022-01
- Invited keynote talk.
The Current State of Deploying Dask on HPC Systems
2021 Dask Developer Summit
Online
2021-05
- Contributed talk.
Cloud-Native Repositories for Big Scientific Data
Computing in Science and Engineering
N/A
2020-11
- Authored with Ryan Abernathey, Tom Augspurger, et al.
Pangeo Benchmarking Analysis: Object Storage vs. POSIX File System
5th International Parallel Data Systems Workshop at 2020 Supercomputing Conference
N/A
2020-10
- Authored with Haiying Xu, Kevin Paul.
The Pangeo Ecosystem: Interactive Computing Tools for the Geosciences: Benchmarking on HPC
2019 Supercomputing Conference Workshop on Interactive High-Performance Computing
N/A
2020-01
- Authored with Tina Erica Odaka, Guillaume Eynard-Bontemps, Aurelien Ponte, Guillaume Maze, Kevin Paul, Jared Baker, Ryan Abernathey.
Pangeo Use Case: Analyzing Initialized Climate Prediction System Datasets with climpred
NOAA’s 45th Climate Diagnostics & Prediction Workshop
Online
2020-10
- Invited talk about climpred, a Python package for weather and climate forecasts.
Zarr: chunked, compressed, multidimensional arrays
2020 Cloud Native Geospatial Outreach Day
Online
2020-09
- Invited talk about Zarr, an open source data format for the storage of chunked, compressed, multidimensional arrays.
Intake-ESM – Making It Easier To Consume Climate and Weather Data
2020 ESIP Summer Meeting
Online
2020-07
- Invited talk about intake-esm, an intake plugin for working with Earth System Model (ESM) datasets.
Dask and Pangeo
2020 Dask Developer Summit
Washington, D.C.
2020-02
- Invited talk.
Interactive Supercomputing with Dask and Jupyter
2019 Scientific Computing with Python conference
Austin, TX
2019-07
- Contributed talk about Dask and Jupyter.
Beyond Matplotlib - Tutorial: Building Interactive Climate Data Visualizations with Bokeh and Friends
2018 UCAR Software Engineering Assembly conference
Boulder, CO
2018-04
- Contributed tutorial about interactive visualization with Python.
PySpark for “Big” Atmospheric Data Analysis
8th Symposium on Advances in Modeling and Analysis Using Python at 2018 AMS Annual Meeting
Austin, TX
2018-01
- Contributed talk about spark-xarray.